Accelerating R-based Analytics on the Cloud

Authors

  • Ishan Patel
  • Andrew Rau-Chaplin
  • Blesson Varghese
Abstract

This paper addresses the problem of harnessing cloud-based infrastructure for the kind of analytical workloads that abound in application domains ranging from computational finance and risk analytics to engineering and manufacturing. Often in this setting, software is not developed by a professional programmer, but on an ad hoc basis by Analysts in high-level programming environments such as R or Matlab. The goal is to allow Analysts to take an analytical job that executes on their personal workstations, including both the software and its associated data, and with minimum effort execute it on a large-scale parallel cloud infrastructure while managing both the resources and the data required by the job. If this can be facilitated gracefully, then the Analyst benefits not only from experimenting with large-scale analytical problems in less time but also from the on-demand resources, low maintenance cost and scalability of computing resources offered by the cloud. In this paper, a Platform for Parallel R-based Analytics on the Cloud (P2RAC), placed between an Analyst and a cloud infrastructure, is proposed and implemented. P2RAC offers a set of command-line tools for managing the resources (such as instances and clusters), the data, and the execution of the software on the Amazon Elastic Compute Cloud (EC2) infrastructure. Experimental studies are pursued on two parallel problems, and the results obtained confirm the feasibility of employing P2RAC for solving large-scale analytical problems on the cloud. Copyright © 2010 John Wiley & Sons, Ltd.

Similar resources

Application of Big Data Analytics in Power Distribution Network

Smart grid enhances optimization in generation, distribution and consumption of the electricity by integrating information and communication technologies into the grid. Today, utilities are moving towards smart grid applications, most common one being deployment of smart meters in advanced metering infrastructure, and the first technical challenge they face is the huge volume of data generated ...

Big Data Analytics and Now-casting: A Comprehensive Model for Eventuality of Forecasting and Predictive Policies of Policy-making Institutions

The ability of now-casting and eventuality is the most crucial and vital achievement of big data analytics in the area of policy-making. To recognize the trends and to render a real image of the current condition and alarming immediate indicators, the significance and the specific positions of big data in policy-making are undeniable. Moreover, the requirement for policy-making institutions to ...

RESCUE: Reputation based Service for Cloud User Environment

The exceptional characteristics of Cloud computing have replaced all traditional computing. With reduced resource management and no up-front investment, it has succeeded in making the IT world migrate towards it. Microsoft announced its office package as Cloud, which can prevent people moving from Windows to Linux. As this drift is escalating at an exponential rate, the cloud environ...

Operational Visibility and Security Analytics Designed for Cloud

We are witnessing a major shift in the foundations of computing. There is an accelerating growth in the scale and the variety of cloud platforms, emerging runtimes, and new programming models. Cloud computing discussion has shifted from mere utility and density to cloud-native services and design patterns. Emerging cloud services allow users to define and provision complex, distributed systems ...

Sustainability Data and Analytics in Cloud-Based M2M Systems

Recently, cloud computing technologies have been employed for large-scale machine-to-machine (M2M) systems, as they could potentially offer better solutions for managing monitoring data of IoTs (Internet of Things) and supporting rich sets of IoT analytics applications for different stakeholders. However, there exist complex relationships between monitored objects, monitoring data, analytics fea...

Journal:
  • Concurrency and Computation: Practice and Experience

Volume 28  Issue 

Pages  -

Publication date: 2016